Distribution of participants by SPM English grade bands
| SPM English Band | n | Percentage |
|---|---|---|
| A-range | 9 | 26.5% |
| B-range | 8 | 23.5% |
| C/D-range | 17 | 50.0% |
Summary of participant retention for pre-post speaking tests:
Summary of participant retention for pre-post writing tests:
Distribution of pre-test speaking test prompts:
##
## PreS001 PreS002 PreS003 PreS004 PreS005 PreS006 PreS007 PreS008 PreS009 PreS010
## 5 4 4 4 5 3 2 2 2 2
## <NA>
## 1
Distribution of post-test speaking test prompts:
##
## PostS011 PostS012 PostS013 PostS014 PostS015 PostS016 PostS017 PostS018
## 5 3 4 4 4 5 4 1
## <NA>
## 4
Distribution of pre-test speaking scores:
Distribution of post-test speaking scores:
Comparison of pre-intervention and post-intervention speaking performance:
Descriptive statistics for pre-post speaking change (p-values and effect sizes are reported separately):
Distribution of pre-post speaking change by SPM English grade segment:
Distribution of pre-test writing scores:
Distribution of post-test writing scores:
Comparison of pre-intervention and post-intervention writing performance:
Descriptive statistics for pre-post writing change (p-values and effect sizes are reported separately):
Distribution of pre-post writing change by SPM English grade segment:
We sample randomly within each segment, keeping the 4:4:7 stratification:
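A stratified draw of this kind can be sketched in base R as follows. The roster is simulated: the `id` values, the seed, and the mapping of the 4:4:7 quota onto the A, B, and C/D bands are assumptions made for illustration. Only the band sizes (9, 8, 17) come from the distribution table above.

```r
# Hypothetical roster: band sizes follow the distribution table (9, 8, 17);
# the participant ids are invented for illustration.
roster <- data.frame(
  id = sprintf("P%02d", 1:34),
  spm_eng_band = rep(c("A-range", "B-range", "C/D-range"), times = c(9, 8, 17)),
  stringsAsFactors = FALSE
)

# Assumed quota per band (4:4:7, 15 in total).
quota <- c("A-range" = 4, "B-range" = 4, "C/D-range" = 7)

set.seed(2024)  # arbitrary seed so the draw is reproducible

# Sample row indices within each band, without replacement.
picked_rows <- unlist(lapply(names(quota), function(band) {
  rows <- which(roster$spm_eng_band == band)
  rows[sample.int(length(rows), quota[[band]])]
}))
subsample <- roster[picked_rows, ]

# Counts per band in the drawn subsample.
table(subsample$spm_eng_band)
```

Sampling indices with `sample.int()` rather than passing the row vector to `sample()` avoids the surprise that `sample(x, ...)` treats a length-one numeric `x` as `1:x`.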
ICC analysis output for pre-speaking test:
## Single Score Intraclass Correlation
##
## Model: twoway
## Type : agreement
##
## Subjects = 15
## Raters = 2
## ICC(A,1) = 0.895
##
## F-Test, H0: r0 = 0 ; H1: r0 > 0
## F(14,14) = 16.9 , p = 2.17e-06
##
## 95%-Confidence Interval for ICC Population Values:
## 0.714 < ICC < 0.963
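`ICC(A,1)` in this output is the two-way, absolute-agreement, single-rater coefficient. As a rough illustration of what it measures, the sketch below computes the same quantity from two-way ANOVA mean squares in base R. The function name `icc_a1` and the toy ratings are hypothetical, and this is not claimed to be the code that produced the output above.

```r
# ICC(A,1): two-way model, absolute agreement, single rater.
# `ratings` is an n x k matrix (rows = subjects, columns = raters).
icc_a1 <- function(ratings) {
  ratings <- as.matrix(ratings)
  n <- nrow(ratings)
  k <- ncol(ratings)
  grand <- mean(ratings)

  # Mean squares for subjects (rows), raters (columns), and residual error.
  ms_rows <- k * sum((rowMeans(ratings) - grand)^2) / (n - 1)
  ms_cols <- n * sum((colMeans(ratings) - grand)^2) / (k - 1)
  ss_err  <- sum((ratings - grand)^2) - (n - 1) * ms_rows - (k - 1) * ms_cols
  ms_err  <- ss_err / ((n - 1) * (k - 1))

  (ms_rows - ms_err) /
    (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)
}

# Toy data (invented): rater 2 scores every subject exactly 2 points higher.
rater1 <- 1:15
rater2 <- rater1 + 2
toy_icc <- icc_a1(cbind(rater1, rater2))
toy_icc  # 40/44, about 0.909: a constant offset lowers absolute agreement
```

The toy case shows why the agreement form was a conservative choice: two raters who rank every subject identically still fall short of 1 when one of them is systematically more generous.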
ICC analysis output for post-speaking test:
## Single Score Intraclass Correlation
##
## Model: twoway
## Type : agreement
##
## Subjects = 15
## Raters = 2
## ICC(A,1) = 0.812
##
## F-Test, H0: r0 = 0 ; H1: r0 > 0
## F(14,4.17) = 17.1 , p = 0.00604
##
## 95%-Confidence Interval for ICC Population Values:
## 0.226 < ICC < 0.946
ICC analysis output for combined pre-post speaking tests:
## Single Score Intraclass Correlation
##
## Model: twoway
## Type : agreement
##
## Subjects = 30
## Raters = 2
## ICC(A,1) = 0.837
##
## F-Test, H0: r0 = 0 ; H1: r0 > 0
## F(29,18.8) = 13.2 , p = 1.64e-07
##
## 95%-Confidence Interval for ICC Population Values:
## 0.652 < ICC < 0.923
Examining pre and post speaking test score ranges:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5.50 6.75 11.00 10.83 13.25 18.50
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.00 7.75 11.00 10.87 13.75 17.50
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.00 10.50 13.50 13.40 17.25 20.50
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.00 6.50 12.50 10.90 14.75 21.50
We sample randomly within each segment, keeping the 4:4:7 stratification:
ICC analysis output for pre-writing test:
## Single Score Intraclass Correlation
##
## Model: twoway
## Type : agreement
##
## Subjects = 15
## Raters = 2
## ICC(A,1) = 0.377
##
## F-Test, H0: r0 = 0 ; H1: r0 > 0
## F(14,15) = 2.28 , p = 0.0629
##
## 95%-Confidence Interval for ICC Population Values:
## -0.112 < ICC < 0.73
ICC analysis output for post-writing test:
## Single Score Intraclass Correlation
##
## Model: twoway
## Type : agreement
##
## Subjects = 15
## Raters = 2
## ICC(A,1) = 0.826
##
## F-Test, H0: r0 = 0 ; H1: r0 > 0
## F(14,4.25) = 18.4 , p = 0.0048
##
## 95%-Confidence Interval for ICC Population Values:
## 0.264 < ICC < 0.95
ICC analysis output for combined pre-post writing tests:
## Single Score Intraclass Correlation
##
## Model: twoway
## Type : agreement
##
## Subjects = 30
## Raters = 2
## ICC(A,1) = 0.724
##
## F-Test, H0: r0 = 0 ; H1: r0 > 0
## F(29,14) = 7.83 , p = 0.000108
##
## 95%-Confidence Interval for ICC Population Values:
## 0.415 < ICC < 0.87
Examining pre and post writing test score ranges:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.0 3.0 5.0 4.4 5.0 12.0
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 2.000 3.000 3.467 4.000 10.000
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 4.000 5.000 6.667 8.000 15.000
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 2.000 4.000 4.933 6.500 13.000
##
## Shapiro-Wilk normality test
##
## data: df_s$pre_post_s_gain
## W = 0.9434, p-value = 0.1123
t-test report:
##
## Paired t-test
##
## data: df_s_p$post_s_score and df_s_p$pre_s_score
## t = 5.5017, df = 29, p-value = 6.297e-06
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 1.863823 4.069511
## sample estimates:
## mean difference
## 2.966667
Wilcoxon signed-rank test report:
##
## Wilcoxon signed rank test with continuity correction
##
## data: df_s_p$post_s_score and df_s_p$pre_s_score
## V = 399, p-value = 8.953e-05
## alternative hypothesis: true location shift is not equal to 0
Effect sizes and 95% CI:
Summary of diff_s from df_s_p:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3.500 0.625 3.500 2.967 5.000 7.500
## [1] -3.5 -3.0 -2.0 -1.5 0.0 0.5 0.5 0.5 1.0 2.0 2.5 3.0 3.0 3.0 3.5
## [16] 3.5 4.0 4.5 4.5 4.5 5.0 5.0 5.0 5.5 5.5 5.5 6.0 6.5 7.0 7.5
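Because a paired t-test is a one-sample t-test on the differences, the sorted gains printed above are enough to reproduce the speaking t-test report by hand. The sketch below does so in base R. The `d_z` line is one common standardised effect size for paired data and is an assumption, since the effect-size measure used in this report is not shown.

```r
# Sorted speaking gains as printed in the output above (n = 30).
diff_s <- c(-3.5, -3.0, -2.0, -1.5, 0.0, 0.5, 0.5, 0.5, 1.0, 2.0,
            2.5, 3.0, 3.0, 3.0, 3.5, 3.5, 4.0, 4.5, 4.5, 4.5,
            5.0, 5.0, 5.0, 5.5, 5.5, 5.5, 6.0, 6.5, 7.0, 7.5)

n      <- length(diff_s)
m_diff <- mean(diff_s)          # mean difference
s_diff <- sd(diff_s)            # SD of the differences
se     <- s_diff / sqrt(n)      # standard error of the mean difference

t_stat <- m_diff / se
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)               # two-sided p-value
ci     <- m_diff + c(-1, 1) * qt(0.975, df = n - 1) * se # 95% CI

# Assumed effect size: d_z = mean difference / SD of differences.
d_z <- m_diff / s_diff

round(c(t = t_stat, df = n - 1, lower = ci[1], upper = ci[2], d_z = d_z), 3)
```

The hand-computed `t_stat` and `ci` should agree with the t = 5.5017 and the interval 1.864 to 4.070 reported by `t.test()`.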
Combined summary of the speaking results:
Visualizing effect size:
##
## Shapiro-Wilk normality test
##
## data: df_w$pre_post_w_gain
## W = 0.97297, p-value = 0.6232
t-test report:
##
## Paired t-test
##
## data: df_w_p$post_w_score and df_w_p$pre_w_score
## t = 1.6437, df = 29, p-value = 0.111
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -0.2442578 2.2442578
## sample estimates:
## mean difference
## 1
Wilcoxon signed-rank test report:
##
## Wilcoxon signed rank test with continuity correction
##
## data: df_w_p$post_w_score and df_w_p$pre_w_score
## V = 210.5, p-value = 0.08497
## alternative hypothesis: true location shift is not equal to 0
Effect sizes and 95% CI:
Summary of diff_w from df_w_p:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -8.00 -0.75 1.00 1.00 3.00 9.00
## [1] -8 -4 -3 -3 -2 -1 -1 -1 0 0 0 0 0 0 1 1 1 2 2 2 2 3 3 3 3
## [26] 5 5 5 6 9
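The Wilcoxon statistic `V` reported for writing can be traced back to the sorted differences printed above: zero differences are dropped, the absolute values are ranked with ties sharing an average rank, and `V` is the sum of the ranks belonging to positive differences. A minimal base R sketch of that bookkeeping follows.

```r
# Sorted writing gains as printed in the output above (n = 30).
diff_w <- c(-8, -4, -3, -3, -2, -1, -1, -1, 0, 0, 0, 0, 0, 0, 1,
            1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 6, 9)

nonzero <- diff_w[diff_w != 0]       # zero differences are discarded
ranks   <- rank(abs(nonzero))        # ties receive average ranks
v_stat  <- sum(ranks[nonzero > 0])   # sum of ranks of positive differences

# Cross-check against the built-in test; warnings about ties and zeros
# are expected here and are suppressed.
v_builtin <- suppressWarnings(wilcox.test(diff_w)$statistic)

c(manual = v_stat, builtin = unname(v_builtin))
```

With six zeros removed, only 24 of the 30 pairs contribute, which is also why R falls back to the normal approximation with continuity correction shown in the report.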
Combined summary of the writing results:
Visualizing effect size:
H2a: Shapiro-Wilk test for normality of residuals:
##
## Shapiro-Wilk normality test
##
## data: resid(m_h2a)
## W = 0.98159, p-value = 0.866
H2a: Levene's test for homogeneity of variances:
H2a: Checking homogeneity of regression slopes:
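A homogeneity-of-slopes check is usually a comparison between the additive model and a model that lets the pre-test slope vary by group. The sketch below shows that comparison on simulated data. The data frame `sim`, its column names, the seed, and the generating coefficients are all invented for illustration and do not reproduce the study data.

```r
set.seed(42)  # arbitrary seed; all data below are simulated

sim <- data.frame(
  spm_eng_band = factor(rep(c("A-range", "B-range", "C/D-range"), each = 10)),
  pre_score    = runif(30, min = 4, max = 18)
)
sim$gain <- 6 - 0.3 * sim$pre_score + rnorm(30, sd = 3)

# Additive model: one common pre-test slope for every band.
m_parallel <- lm(gain ~ spm_eng_band + pre_score, data = sim)

# Interaction model: a separate pre-test slope per band.
m_separate <- lm(gain ~ spm_eng_band * pre_score, data = sim)

# F-test on the two interaction terms; a non-significant result is
# consistent with the equal-slopes assumption.
slopes_check <- anova(m_parallel, m_separate)
slopes_check
```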
Fitting a linear regression model to explain speaking score change using SPM English grade and pre-test score:
##
## Call:
## lm(formula = gain_s ~ spm_eng_band + pre_s_score, data = df_s_p)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.7027 -2.1393 0.1355 1.7128 4.9240
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.9204 2.9857 2.653 0.0134 *
## spm_eng_bandB-range -2.1319 1.6952 -1.258 0.2197
## spm_eng_bandC/D-range -3.0911 1.8827 -1.642 0.1127
## pre_s_score -0.2957 0.1954 -1.513 0.1422
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.956 on 26 degrees of freedom
## Multiple R-squared: 0.1021, Adjusted R-squared: -0.001514
## F-statistic: 0.9854 on 3 and 26 DF, p-value: 0.415
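The negative adjusted R-squared in this summary follows directly from the reported figures. The short sketch below recomputes it, together with the overall F-statistic, from the multiple R-squared and the degrees of freedom. Inputs are the rounded values printed above, so results match only to rounding error.

```r
r2 <- 0.1021   # Multiple R-squared from the summary above
n  <- 30       # 26 residual df + 4 estimated coefficients
p  <- 3        # predictors excluding the intercept (two band dummies + pre_s_score)

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F-test of the model against the intercept-only model.
f_stat <- (r2 / p) / ((1 - r2) / (n - p - 1))
p_val  <- pf(f_stat, p, n - p - 1, lower.tail = FALSE)

round(c(adj_r2 = adj_r2, F = f_stat, p = p_val), 4)
```

An adjusted value below zero simply means the three predictors explain less variance than would be expected by chance for this sample size.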
Type II ANOVA testing whether SPM grade segment has an overall effect on gains after adjusting for pre-test score:
Effect size for each predictor:
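For a model with main effects only, the Type II test of a term is the F-test obtained by dropping that term from the full model, and partial eta-squared is the term's sum of squares divided by that sum plus the residual sum of squares. The sketch below illustrates both in base R on simulated data. The data frame `sim`, the seed, and the generating coefficients are invented, and `drop1()` stands in for whichever ANOVA function the report used.

```r
set.seed(7)  # arbitrary seed; all data below are simulated

sim <- data.frame(
  band = factor(rep(c("A-range", "B-range", "C/D-range"), each = 10)),
  pre  = runif(30, min = 4, max = 18)
)
sim$gain <- 6 - 0.3 * sim$pre + rnorm(30, sd = 3)

fit <- lm(gain ~ band + pre, data = sim)

# Dropping each term in turn gives the Type II tests for this model.
type2 <- drop1(fit, test = "F")

# Partial eta-squared: SS_effect / (SS_effect + SS_residual).
ss_effect    <- type2[c("band", "pre"), "Sum of Sq"]
ss_resid     <- sum(resid(fit)^2)
partial_eta2 <- setNames(ss_effect / (ss_effect + ss_resid), c("band", "pre"))

type2
round(partial_eta2, 3)
```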
Pairwise comparisons to identify which groups, if any, differ significantly:
## contrast estimate SE df t.ratio p.value
## (A-range) - (B-range) 2.132 1.70 26 1.258 0.4312
## (A-range) - (C/D-range) 3.091 1.88 26 1.642 0.2466
## (B-range) - (C/D-range) 0.959 1.41 26 0.680 0.7769
##
## P value adjustment: tukey method for comparing a family of 3 estimates
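The Tukey-adjusted p-values in this table can be approximated from the printed `t.ratio` column alone. The sketch below assumes the adjustment is the usual studentized-range calculation, with the t ratio rescaled by the square root of 2, and uses the rounded t ratios from the table, so small rounding differences are to be expected.

```r
# t ratios and residual df copied from the contrast table above.
t_ratio  <- c("A - B" = 1.258, "A - C/D" = 1.642, "B - C/D" = 0.680)
df_resid <- 26
n_groups <- 3   # size of the family of estimates being compared

# Unadjusted two-sided p-values from the t distribution.
p_unadj <- 2 * pt(-abs(t_ratio), df_resid)

# Tukey adjustment (assumed form): studentized range with q = sqrt(2) * |t|.
p_tukey <- ptukey(sqrt(2) * abs(t_ratio), nmeans = n_groups,
                  df = df_resid, lower.tail = FALSE)

round(cbind(p_unadj, p_tukey), 4)
```

The unadjusted column reproduces the p-values of the band coefficients in the regression summary, which makes the cost of the multiplicity correction easy to see.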
H2b: Shapiro-Wilk test for normality of residuals:
##
## Shapiro-Wilk normality test
##
## data: resid(m_h2b)
## W = 0.98223, p-value = 0.8813
H2b: Levene's test for homogeneity of variances:
H2b: Checking homogeneity of regression slopes:
Fitting a linear regression model to explain writing score change using SPM English grade and pre-test score:
##
## Call:
## lm(formula = gain_w ~ spm_eng_band + pre_w_score, data = df_w_p)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.6085 -1.9136 -0.0713 1.6185 5.4485
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.8955 2.3741 3.326 0.00263 **
## spm_eng_bandB-range -5.2805 1.7340 -3.045 0.00527 **
## spm_eng_bandC/D-range -5.3615 1.9268 -2.783 0.00991 **
## pre_w_score -0.5563 0.2301 -2.417 0.02294 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.974 on 26 degrees of freedom
## Multiple R-squared: 0.2856, Adjusted R-squared: 0.2032
## F-statistic: 3.465 on 3 and 26 DF, p-value: 0.03057
Type II ANOVA testing whether SPM grade segment has an overall effect on gains after adjusting for pre-test score:
Effect size for each predictor:
Pairwise comparisons to identify which groups, if any, differ significantly:
## contrast estimate SE df t.ratio p.value
## (A-range) - (B-range) 5.281 1.73 26 3.045 0.0141
## (A-range) - (C/D-range) 5.361 1.93 26 2.783 0.0259
## (B-range) - (C/D-range) 0.081 1.41 26 0.057 0.9982
##
## P value adjustment: tukey method for comparing a family of 3 estimates
H1a sensitivity check: paired t-test using hybrid scores
##
## Paired t-test
##
## data: df_s_rbchk$post_s_used and df_s_rbchk$pre_s_used
## t = 5.2019, df = 29, p-value = 1.447e-05
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 1.415945 3.250722
## sample estimates:
## mean difference
## 2.333333
H1b sensitivity check: paired t-test using hybrid scores
##
## Paired t-test
##
## data: df_w_rbchk$post_w_used and df_w_rbchk$pre_w_used
## t = 1.4157, df = 29, p-value = 0.1675
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -0.3557092 1.9557092
## sample estimates:
## mean difference
## 0.8
H2a sensitivity check: baseline-adjusted model using hybrid scores
##
## Call:
## lm(formula = gain_s_used ~ spm_eng_band + pre_s_used, data = df_s_rbchk)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.1645 -1.5887 0.0267 1.5183 4.6469
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.8711 2.4835 2.767 0.0103 *
## spm_eng_bandB-range -1.9676 1.3842 -1.421 0.1671
## spm_eng_bandC/D-range -3.2704 1.5609 -2.095 0.0460 *
## pre_s_used -0.2497 0.1624 -1.538 0.1361
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.4 on 26 degrees of freedom
## Multiple R-squared: 0.1446, Adjusted R-squared: 0.04585
## F-statistic: 1.464 on 3 and 26 DF, p-value: 0.2472
H2a sensitivity check: Type II ANOVA
H2b sensitivity check: baseline-adjusted model using hybrid scores
##
## Call:
## lm(formula = gain_w_used ~ spm_eng_band + pre_w_used, data = df_w_rbchk)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.634 -1.525 -0.150 1.121 5.162
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.3365 2.2889 2.768 0.01025 *
## spm_eng_bandB-range -4.8037 1.6732 -2.871 0.00803 **
## spm_eng_bandC/D-range -4.3535 1.8733 -2.324 0.02821 *
## pre_w_used -0.4332 0.2286 -1.895 0.06920 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.839 on 26 degrees of freedom
## Multiple R-squared: 0.2456, Adjusted R-squared: 0.1585
## F-statistic: 2.821 on 3 and 26 DF, p-value: 0.05856
H2b sensitivity check: Type II ANOVA
| Total N | Mean Quality | SD | Min | Max | N (Robust) | Avg Sessions |
|---|---|---|---|---|---|---|
| 34 | 4.32 | 1.13 | 2 | 6.6 | 30 | 4.15 |
##
## Shapiro-Wilk normality test
##
## data: resid(m_h3a)
## W = 0.98108, p-value = 0.8536
## Non-constant Variance Score Test
## Variance formula: ~ fitted.values
## Chisquare = 2.589108, Df = 1, p = 0.1076
## GVIF Df GVIF^(1/(2*Df))
## padlet_mean 1.223071 1 1.105925
## pre_s_score 2.080424 1 1.442367
## spm_eng_band 2.248739 2 1.224573
##
## Call:
## lm(formula = gain_s ~ padlet_mean + pre_s_score + spm_eng_band,
## data = df_s_reg)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.7456 -2.2003 0.2613 1.7483 4.7704
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.0309 3.9953 2.260 0.0328 *
## padlet_mean -0.2390 0.5596 -0.427 0.6729
## pre_s_score -0.2887 0.1992 -1.449 0.1598
## spm_eng_bandB-range -2.3469 1.7946 -1.308 0.2028
## spm_eng_bandC/D-range -3.2892 1.9684 -1.671 0.1072
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.003 on 25 degrees of freedom
## Multiple R-squared: 0.1086, Adjusted R-squared: -0.03403
## F-statistic: 0.7614 on 4 and 25 DF, p-value: 0.5602
## 2.5 % 97.5 %
## -1.3915156 0.9134816
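The unlabelled interval above is numerically consistent with a 95% confidence interval for the `padlet_mean` coefficient. That reading is an assumption, since the row label is not printed. The sketch below rebuilds the interval from the estimate, standard error, and residual degrees of freedom in the coefficient table.

```r
# Values copied from the padlet_mean row and the residual df above.
est      <- -0.2390
se       <- 0.5596
df_resid <- 25

# 95% CI: estimate +/- t critical value * standard error.
t_crit <- qt(0.975, df_resid)
ci     <- est + c(-1, 1) * t_crit * se

round(c(lower = ci[1], upper = ci[2]), 4)
```

Because the interval spans zero, it tells the same story as the coefficient's p-value of 0.6729.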
##
## Call:
## lm(formula = gain_s ~ padlet_mean + pre_s_score + spm_eng_band,
## data = df_s_reg_robust)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.6668 -1.7960 0.1483 1.8679 4.9434
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.5480 4.1113 2.322 0.0294 *
## padlet_mean -0.2855 0.5740 -0.497 0.6236
## pre_s_score -0.3083 0.2065 -1.493 0.1490
## spm_eng_bandB-range -2.1229 1.8689 -1.136 0.2677
## spm_eng_bandC/D-range -3.6003 2.0631 -1.745 0.0943 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.062 on 23 degrees of freedom
## Multiple R-squared: 0.125, Adjusted R-squared: -0.02717
## F-statistic: 0.8215 on 4 and 23 DF, p-value: 0.5248
##
## Shapiro-Wilk normality test
##
## data: resid(m_h3b)
## W = 0.98023, p-value = 0.8316
## Non-constant Variance Score Test
## Variance formula: ~ fitted.values
## Chisquare = 0.1099283, Df = 1, p = 0.74023
## GVIF Df GVIF^(1/(2*Df))
## padlet_mean 1.261764 1 1.123283
## pre_w_score 2.182084 1 1.477188
## spm_eng_band 2.280277 2 1.228844
##
## Call:
## lm(formula = gain_w ~ padlet_mean + pre_w_score + spm_eng_band,
## data = df_w_reg)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.8675 -1.6545 -0.1785 1.2300 4.7741
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.8624 3.2018 0.894 0.37985
## padlet_mean 1.1127 0.5102 2.181 0.03882 *
## pre_w_score -0.6249 0.2174 -2.875 0.00815 **
## spm_eng_bandB-range -4.3781 1.6729 -2.617 0.01484 *
## spm_eng_bandC/D-range -4.5238 1.8416 -2.456 0.02133 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.78 on 25 degrees of freedom
## Multiple R-squared: 0.3998, Adjusted R-squared: 0.3038
## F-statistic: 4.163 on 4 and 25 DF, p-value: 0.01015
## 2.5 % 97.5 %
## 0.06187987 2.16349535
##
## Call:
## lm(formula = gain_w ~ padlet_mean + pre_w_score + spm_eng_band,
## data = df_w_reg_robust)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.4084 -1.4596 -0.2434 1.4392 5.4064
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.3611 3.2152 0.734 0.47016
## padlet_mean 1.1867 0.5111 2.322 0.02946 *
## pre_w_score -0.6114 0.2191 -2.791 0.01039 *
## spm_eng_bandB-range -4.7898 1.6986 -2.820 0.00971 **
## spm_eng_bandC/D-range -4.3048 1.8798 -2.290 0.03152 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.77 on 23 degrees of freedom
## Multiple R-squared: 0.4378, Adjusted R-squared: 0.34
## F-statistic: 4.477 on 4 and 23 DF, p-value: 0.008028